perf: replay cached dry-run diffs for unchanged files, 9-150x faster warm dry-runs #8033
Open
SanderMuller wants to merge 3 commits into
samsonasik
reviewed
Jun 11, 2026
The cache only checked each file's own content, so a clean file stayed skipped on warm runs even when one of its dependencies changed, e.g. a parent class method gaining a return type that lets a child file infer its own. A fresh run reports the new change, a warm run misses it. PHPStanNodeScopeResolver now records each file's dependencies during scope resolution using PHPStan's own DependencyResolver, the same engine behind PHPStan's result cache. Cache entries store the file's own hash plus one hash per dependency, all re-validated on load; legacy string entries self-upgrade on the next write. A failed capture skips caching entirely rather than caching a partial set. Function calls memoize their dependency files per resolved name, as signature dependencies are identical at every call site. Selective runs (--only, --only-suffix) bypass the cache write, same guard as rectorphp#8029. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Files with a pending diff are never marked clean in dry-run mode, the diff must keep being reported, so every warm dry-run reprocessed them from scratch. On a 4,400-file project with 37 pending diffs that was ~11s per run. Cache the FileDiff with the file's own hash plus one hash per captured dependency; when all still match, replay the cached diff instead of reprocessing, skipping scope resolution entirely. Dry-run only: write mode always computes fresh. --no-diffs results never cross into normal entries, and the original hasChanged flag is replayed, as a rule can report line changes while printing identical content. Warm dry-run on the same project: ~9x faster single process, ~3.5x parallel. Output stays byte-identical in every cache state. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
SimpleParameterProvider::hash() serializes the whole parameter bag and contentHash() runs per file, so a warm run paid the serialization once per file (~46ms per 3,200 calls with a 300-entry skip list). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
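A minimal sketch of the memoization this commit message describes, assuming a static parameter bag. The class name `ParameterBag`, its `set()` method and the reset-on-write behaviour are hypothetical and only illustrate the idea; they are not Rector's actual `SimpleParameterProvider`.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical stand-in for a parameter provider (not Rector's real class).
 * The serialized hash is computed once per process and reused per file.
 */
final class ParameterBag
{
    /** @var array<string, mixed> */
    private static array $parameters = [];

    private static ?string $memoizedHash = null;

    public static function set(string $name, mixed $value): void
    {
        self::$parameters[$name] = $value;
        // Assumption of this sketch: any write drops the memo so the hash stays accurate.
        self::$memoizedHash = null;
    }

    public static function hash(): string
    {
        // serialize() walks the whole bag, so only do it when no memo exists yet.
        return self::$memoizedHash ??= hash('sha256', serialize(self::$parameters));
    }
}

ParameterBag::set('skip', array_fill(0, 300, 'some/path'));

$first = ParameterBag::hash();  // serializes the bag once
$second = ParameterBag::hash(); // served from the memo, no serialization
var_dump($first === $second);   // bool(true)
```

With the memo in place, a per-file `contentHash()` style call can include the parameter hash without paying the serialization cost on every file.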
Replays cached dry-run diffs so warm dry-runs skip reprocessing unchanged files (9-150x). The replay is only sound if the cache invalidates when a dependency changes, so this also carries the dependency-aware capture it builds on — the correctness prerequisite, originally opened as #8028 and now closed in favour of landing it here. Review order: the dependency capture is the first commit, the replay is the last two.
Problem
Files with a pending diff are never marked clean in dry-run mode (correctly, since the diff must keep being reported), so every warm dry-run reprocesses them from scratch: parse, full PHPStan scope resolution, every rule. On real projects most warm time is exactly this. laravel/framework src/Illuminate with the prepared sets has 1,526 pending diffs: a warm dry-run costs almost the same as a cold one (220s vs 240s single process).
The cache was also dependency-blind: a clean file stayed skipped on a warm run even when a dependency changed (e.g. a parent gains : int, so a child can now infer its own return type). A diff cache keyed only on a file's own content would replay that stale diff, so the key has to see dependencies first.
Change
Dependency-aware capture. PHPStanNodeScopeResolver runs PHPStan's own DependencyResolver per node and records the surfaced files. Cache entries become {hash, deps:{file:hash}}, re-validated on load; legacy string entries self-upgrade; a failed capture means the file is never cached (no partial sets).
Diff replay. Cache the produced FileDiff keyed on the file's own content hash, the parameter hash and one content hash per captured dependency. When everything still matches on the next run, replay the cached diff instead of reprocessing the file, skipping scope resolution entirely.
Dry-run only: write mode always computes fresh. Selective runs (--only, --only-suffix) bypass the cache entirely. --no-diffs results never cross into normal entries. The original hasChanged flag is replayed, since a rule can report line changes while printing identical content. The parameter hash is memoized per process, as computing it serializes the whole parameter bag and the key needs it per file.
Numbers
Output byte-identical to a fresh run in every cache state, verified per measurement. The warm gain scales with how many pending diffs a project has; a fully clean project sees no change. Cold cost is the dependency capture (~7-8% interleaved in this guard-free form); the replay itself adds nothing measurable on top.
Verification
Invalidation is covered end-to-end in tests: own-content change, dependency change (fresh-process simulation), --no-diffs cross-replay, the hasChanged flag round-trip. Replay works in parallel mode (workers save, workers replay).
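For reviewers, the {hash, deps:{file:hash}} validation described under Change can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: the function names `buildEntry` and `isEntryFresh`, the `$hashOf` callback and the in-memory file map are all hypothetical, and treating a legacy string entry as a cache miss is a choice made for the sketch only.

```php
<?php

declare(strict_types=1);

/**
 * Builds an entry of the shape {hash, deps: {file: hash}}.
 * Returns null when capture failed, so nothing partial is ever cached.
 *
 * @param list<string>|null $dependencyPaths null signals a failed capture
 * @param callable(string): ?string $hashOf returns null for an unreadable file
 * @return array{hash: string, deps: array<string, string>}|null
 */
function buildEntry(string $filePath, ?array $dependencyPaths, callable $hashOf): ?array
{
    $ownHash = $hashOf($filePath);
    if ($dependencyPaths === null || $ownHash === null) {
        return null;
    }

    $deps = [];
    foreach ($dependencyPaths as $dependencyPath) {
        $dependencyHash = $hashOf($dependencyPath);
        if ($dependencyHash === null) {
            return null; // an unhashable dependency counts as a failed capture
        }
        $deps[$dependencyPath] = $dependencyHash;
    }

    return ['hash' => $ownHash, 'deps' => $deps];
}

/**
 * True only when the file's own hash and every dependency hash still match.
 *
 * @param string|array{hash: string, deps: array<string, string>}|null $entry
 * @param callable(string): ?string $hashOf
 */
function isEntryFresh(string $filePath, string|array|null $entry, callable $hashOf): bool
{
    // Assumption: a legacy bare-string entry (or no entry) is treated as a miss.
    if (! is_array($entry)) {
        return false;
    }

    if ($hashOf($filePath) !== $entry['hash']) {
        return false;
    }

    foreach ($entry['deps'] as $dependencyPath => $dependencyHash) {
        if ($hashOf($dependencyPath) !== $dependencyHash) {
            return false;
        }
    }

    return true;
}

// In-memory stand-in for the filesystem, for illustration only.
$contents = [
    'Child.php' => 'class Child extends Base { function id() { return parent::id(); } }',
    'Base.php' => 'class Base { function id() { return 1; } }',
];
$hashOf = static function (string $path) use (&$contents): ?string {
    return isset($contents[$path]) ? hash('sha256', $contents[$path]) : null;
};

$entry = buildEntry('Child.php', ['Base.php'], $hashOf);
var_dump(isEntryFresh('Child.php', $entry, $hashOf)); // bool(true)

// The parent gains ": int"; the child's own content is untouched.
$contents['Base.php'] = 'class Base { function id(): int { return 1; } }';
var_dump(isEntryFresh('Child.php', $entry, $hashOf)); // bool(false)
```

The second check fails even though Child.php itself is unchanged, which is the case a cache keyed only on a file's own content would miss.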